perm filename INTRO[0,BGB]7 blob
sn#105706 filedate 1974-06-09 generic text, type C, neo UTF8
TITLE PAGE - 1. AUGUST 1974.
draft - draft - draft - draft - draft - draft - draft - draft - draft
GEOMETRIC MODELING FOR COMPUTER VISION.
BRUCE G. BAUMGART
ABSTRACT:
The main idea of this thesis is that a 3-D geometric model of
the physical world is an essential part of a general purpose vision
system. Such a model provides a goal for descriptive image analysis,
an origin for image synthesis (for verification), and a context for
spatial problem solving. Some of the design ideas to be presented
have been implemented in two programs named GEOMED and CRE; the
programs are demonstrated in situations involving relative camera
motion.
---------------------------------------------------------------------
This research was supported in part by the Advanced Research
Projects Agency of the Office of the Secretary of Defense under
Contract No. SD-183.
The views and conclusions contained in this document are those of
the author and should not be interpreted as necessarily representing
the official policies, either expressed or implied, of the Advanced
Research Projects Agency or the United States Government.
CONTENTS:
{INTRO} 0. INTRODUCTION.
{GEM} 1. GEOMETRIC MODELING THEORY.
{WINGED} 2. THE WINGED EDGE POLYHEDRON REPRESENTATION.
{GEOMED} 3. GEOMED AS A GEOMETRIC MODELING COMMAND LANGUAGE.
{BIN} 4. A POLYHEDRON INTERSECTION ALGORITHM.
{OCCULT} 5. HIDDEN LINE ELIMINATION FOR COMPUTER VISION.
{CNTOUR} 6. VIDEO IMAGE CONTOURING.
{CMPARE} 7. IMAGE COMPARING.
{CAMERA} 8. CAMERA SOLVING.
{VIS} 9. COMPUTER VISION THEORY.
{CONCLU} 10. CONCLUSION.
APPENDICES:
{REF} REFERENCES.
{GNODES} GEOMED NODE FORMATS.
{CNODES} CRE NODE FORMATS.
TITLE PAGE - 2. AUGUST 1974.
GEOMETRIC MODELING FOR COMPUTER VISION.
---------------------------------------------------------------------
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
---------------------------------------------------------------------
BY
BRUCE G. BAUMGART
AUGUST 1974
---------------------------------------------------------------------
DETAILED TABLE OF CONTENTS.
LIST OF BOXES.
LIST OF FIGURES.
---------------------------------------------------------------------
ACKNOWLEDGEMENTS
Thesis Adviser: John McCarthy
Readers: Jerome A. Feldman
Donald E. Knuth
Alan C. Kay
People:
Jerry Agin,
Leona Baumgart,
Tom Binford,
Jack Buchanan,
Les Earnest,
Tom Gafford,
Steve Gibson,
Ralph Gorin,
Tovar Mock,
Andy Moorer,
Hans Moravec,
Richard Orban,
Ted Panofsky,
Lou Paul,
Lynn Quam,
Jeff Raskin,
Ron Rivest,
Irwin Sobel,
Robert Sproull,
Ivan Sutherland,
Dan Swinehart,
Russell Taylor,
Marty Tenenbaum,
Arthur Thomas.
0.0 INTRODUCTION.
"For the purpose of presenting my argument I must first
explain the basic premise of sorcery as don Juan presented it to me.
He said that for a sorcerer, the world of everyday life is not real,
or out there, as we believe it is. For a sorcerer, reality or the
world we all know, is only a description. For the sake of validating
this premise don Juan concentrated the best of his efforts into
leading me to a genuine conviction that what I held in mind as the
world at hand was merely a description of the world; a description
that had been pounded into me from the moment I was born."
- Carlos Castaneda. Journey to Ixtlan.
This thesis is about computer techniques for handling 3-D
geometric descriptions of the world; the world that can be visually
perceived with a television camera. The overall design idea may be
characterized as an inverse computer graphics approach to computer
vision. In computer graphics, the world is represented in sufficient
detail so that the image forming process can be numerically simulated
to generate synthetic television images; in the inverse, perceived
television pictures (from a real TV camera) are analysed to compute
detailed geometric models. For example, the polyhedron in figure **
was automatically computed from eight views of a plastic horse on a
turntable. It is hoped that visually acquired 3-D geometric models
can be of use to other robotic processes such as manipulation,
navigation or recognition.
Once acquired, a 3-D model can be used to anticipate the
appearance of an object in a scene, making feasible a quantitative
form of vision by verification (feedback vision). For example, the
predicted video appearance of the two machine parts depicted in
figure ** can be computed, figure **, and compared with an actual
perceived video image, figure **. By comparing the predicted image
with a perceived image, the correspondence between features of the
internal model and features of the external reality can be
established (figure **), the precise location of the parts and the
camera can be measured, and new phenomena can be detected, such as
the little black cube in the lower left of the perceived image.
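The comparison step can be illustrated with a toy sketch. It assumes
that both images are small grey-level arrays (lists of rows) and uses
plain pixel differencing with an arbitrary threshold. That is a
stand-in chosen for brevity, not the comparison method of this thesis,
and the function name and parameters are hypothetical.

```python
def unexplained_region(predicted, perceived, threshold=32):
    """Return the bounding box (row_min, col_min, row_max, col_max) of the
    pixels where the perceived image differs from the predicted image by
    more than `threshold`, or None if the prediction explains every pixel.

    Assumption: both images are equal-sized lists of rows of grey levels.
    """
    rows = []
    cols = []
    for r, (pred_row, perc_row) in enumerate(zip(predicted, perceived)):
        for c, (p, q) in enumerate(zip(pred_row, perc_row)):
            if abs(p - q) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


# Toy 4x5 images: the predicted image is uniformly bright; the perceived
# image contains a small dark object in the lower left that the model lacks.
predicted = [[200] * 5 for _ in range(4)]
perceived = [row[:] for row in predicted]
perceived[2][0] = 10
perceived[3][0] = 10
perceived[3][1] = 10

box = unexplained_region(predicted, perceived)
print(box)
```

A region reported by such a test is a candidate for a new object that
the internal model does not yet describe.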
The chapters proceed from theory, through implementation,
and back to theory, with the first five chapters dealing with
modeling and the last five chapters dealing with vision. The theory
consists of two essays: the first, on geometric modeling, in chapter
one, and the second, on vision, in chapter nine. The implementation
consists of two programs named GEOMED and CRE. CRE is a solution to
the problem of finding intensity contours in a sequence of television
pictures and of linking corresponding contours between pictures.
GEOMED is a system of 3-D modeling routines with which arbitrary
polyhedra may be constructed, altered, or viewed in perspective
with hidden lines eliminated. It is a sequence of GEOMED operations
that generates new object descriptions from contour images.
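As a rough illustration of viewing a polyhedron in perspective, the
sketch below projects the vertices of a cube through an ideal pinhole
camera looking along the +z axis. The vertex list, the edge rule, and
the function name are assumptions made for this example; they are not
GEOMED's data structures or routines, and hidden lines are not
eliminated here.

```python
def project(vertex, focal_length=1.0):
    """Perspective-project a 3-D point (x, y, z) onto the image plane of a
    pinhole camera at the origin looking along +z (an assumed camera model)."""
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must lie in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# A box-shaped polyhedron with its near face at z = 4 and far face at z = 6.
cube = [(x, y, z)
        for x in (-1.0, 1.0)
        for y in (-1.0, 1.0)
        for z in (4.0, 6.0)]

# Two vertices of this box share an edge when they differ in exactly one
# coordinate.
edges = [(i, j)
         for i in range(len(cube))
         for j in range(i + 1, len(cube))
         if sum(a != b for a, b in zip(cube[i], cube[j])) == 1]

# A wire-frame view is the list of projected edge endpoints.
image_points = [project(v) for v in cube]
wire_frame = [(image_points[i], image_points[j]) for i, j in edges]

for segment in wire_frame:
    print(segment)
```

The near face projects larger than the far face, which is the
foreshortening expected of a perspective view.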